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Abstract.  An  m  x  m  symmetric  nonnegative  definite  matrix  E  has  Cholesky  factorization  E  = 
UTU.  By  carrying  out  the  factorization  in  a  particular  way  for  positive  definite  E,  the  Schur 
complements  of  all  the  leading  principal  submatrices  of  E-  are  produced,  as  well  as  their  Cholesky 
factors.  It  is  shown  how  the  same  can  be  done  for  generalized  Schur  complements  when  E  is 
singular.  When  E  is  the  population  covariance  matrix  of  a  multivariate  random  distribution,  partial 
covariances  and  correlations  can  be  defined  in  terms  of  the  elements  of  such  Schur  complements. 
It  follows  that  these  can  be  produced  efficiently  and  reliably  from  the  Cholesky  factorization. 

When  n  x  m  A  is  given  and  E  =  AT A,  the  Cholesky  factor  U  may  be  found  directly  from  the 
QR  factorization  A  =  Q\U ,  Q\Qi  =  /,  and  this  is  preferable  in  many  numerical  computations. 
This  QR  factorization,  or  the  modified  Gram-Schmidt  orthogonalization,  produces  projections  of 
later  columns  of  A  onto  spaces  orthogonal  to  earlier  columns.  It  is  shown  how  the  cosines  of 
the  angles  between  suck  projected  vectors  can  be  found  using  the  elements  of  U .  These  cosines 
produced  from  A  turn  out  to  be  the  previously  mentioned  partial  correlation  coefficients  produced 
from  E,  when  E  =  AT A.  When  A  is  obtained  from  observations  of  random  variables,  these  are 
the  sample  correlation  coefficients.  It  is  shown  how  such  correlation  coefficients  can  be  efficiently 
obtained  when  observations  are  added  or  deleted.  This  corresponds  to  altering  all  of  A  in  a  certain 
simple  way,  and  adding  or  deleting  rows. 
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Notation 


We  will  be  dealing  with  vectors  and  matrices  in  their  own  right,  as  well  as  treating  them 
as  statistical  objects.  Because  of  the  very  different  notation  used  in  econometrics,  statistics,  and 
other  areas  which  use  matrix  theory,  we  will  use  a  standard  notation  of  matrix  theory  and  describe 
the  convention  we  use  within  this  for  representing  statistical  objects.  We  describe  these  fully  for 
possibly  different  audiences. 

Matrix  Theory  Notation.  Upper  case  greek  and  italic  letters,  like  E  and  A,  will  denote  matrices; 
lower  case  italics,  like  a,  and  x,  will  denote  vectors;  except  that  »,  j,  1 fc,  m,  n,  p,  q,  r,  s,  t  will 
represent  integers  or  indices.  Lower  case  greek  letters  will  denote  other  scalars.  Thus, 

A{n  x  m)  =  ( oj  ...  am  )  =  (a,,) 

represents  an  n  x  m  matrix  A  with  columns  <n,. . .  ,am  and  elements 

A  principal  submatrix  of  £  is  a  square  submatrix  whose  diagonal  elements  are  also  diagonal 
elements  of  £.  A  leading  principal  submatrix  of  E  is  a  principal  submatrix  in  the  top  left  corner  of 
E. 

We  will  make  frequent  use  of  the  following  notions:  £r  denotes  the  transpose  of  £;  5Rn  is  the 
real  n-space  and  ||x||2  =  V zTx  the  two-norm  of  a  vector  x.  The  range  or  image  or  column  space 
of  a  matrix  A  is  denoted  by  f?(A),  and  its  orthogonal  complement  by  f?(A)x.  0(a, fc)  is  the  angle 
between  two  vectors  a  and  b  of  equal  dimension. 


Statistical  Notation.  This  will  always  follow  the  matrix  notation  above.  We  will  use  the  same 
notation  for  a  random  variable  and  an  instance  or  observation  of  that  variable.  Thus  x  can  represent 
a  random  vector  or  an  observation  of  this  random  vector,  the  meaning  will  be  clear  from  the  context. 
E(x)  denotes  the  expected  value,  and  x  ~  (a,  E)  denotes  a  vector  x  of  random  variables  with  mean  a 
and  variance-covariance  matrix  (covariance  for  short)  E. 


1.  Introduction 

For  a  given  matrix 


with  nonsingular  E,  the  Schur  complement  of  E  in  E,  often  written  (E/J£),  is 

S  =  (£/£)  =  H  -GE~lF.  (1.2) 


There  is  an  extensive  literature  on  this,  and  it  is  a  theoretical  tool  widely  used  by  statisticians  as 
can  be  seen  from  the  bibliography  prepared  by  Ouellette  [12],  see  also  [2,  3].  Perhaps  the  most 
common  use  by  statisticians  of  the  Schur  complement  occurs  when  E  is  a  symmetric  positive  definite 
covariance  matrix,  and  the  Schur  complement  of  a  principal  submatrix  is  required.  In  this  case 
there  is  a  simple  relation  between  Schur  complements  and  Cholesky  factors.  Suppose  we  partition 
the  matrices 


where  U  is  the  upper  triangular  Cholesky  factor  of  E.  It  follows  that 
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so  that  the  Cholesky  factor  of  a  symmetric  positive  definite  E  not  only  provides  the  Cholesky 
factors  of  all  the  leading  principal  submatrices  En  =  U^Uu  of  E,  but  also  the  Cholesky  factors 
of  the  Schur  complements  of  all  the  leading  principal  submatrices  (E/Ea)  =  U^U-a  in  E.  By 
permuting  rows,  and  permuting  columns  in  the  same  way,  any  principal  Bubmatrix  can  be  made  a 
leading  principal  submatrix,  and  the  above  well  known  result  suggests  we  might  be  able  to  use  the 
Cholesky  factor  and  ignore  the  Schur  complement.  This  is  not  quite  true,  as  we  now  point  out. 

When  E  is  the  covariance  matrix  of  a  multivariate  distribution,  the  population  partial  correla¬ 
tion  coefficients  are  defined  in  terms  of  the  elements  of  Schur  complements  of  principal  submatrices 
of  E,  see  [1]  pp.  37,41,  so  these  Schur  complements  are  really  needed.  In  §2  we  show  that  if 
the  Cholesky  factorization  is  carried  out  in  a  particular  order,  then  the  Schur  complements  of  all 
leading  principal  submatrices  occur  naturally  as  part  of  the  factorization;  that  is,  without  having 
to  form  C/£C/„. 

When  E  is  singular  or  not  square  in  (1.1)  Marsaglia  and  Styan  [11]  define  a  generalized  Schur 
complement 

S  =  (E/E)  =  H  -  GE~F,  EE~E=  E.  (1.5) 

When  E  is  symmetric  nonnegative  definite  and  possibly  singular,  this  generalized  Schur  complement 
is  unique  and  symmetric  nonnegative  definite,  and  we  show  in  §2  how  it,  and  its  Cholesky  factor, 
occur  naturally  as  part  of  a  particular  Cholesky  factorization  of  E.  In  fact  the  form  and  the 
factorization  is  as  simple  as  that  for  the  positive  definite  case.  When  it  is  also  realized  that  such 
algorithms  for  producing  Cholesky  factors  are  very  efficient,  and  are  numerically  reliable  when  E 
is  positive  definite  (see  for  example  [8],  p.  89),  and  can  nearly  always  be  made  reliable  when  E  is 
singular,  see  [9],  and  that  results  involving  (generalized)  Schur  complements  of  principal  submatrices 
are  easily  derived  using  Cholesky  factors,  the  argument  for  thinking  largely  in  terms  of  Cholesky 
factors  becomes  quite  strong,  especially  for  those  interested  in  computations. 

In  §3  we  discuss  correlation  coefficients,  partial  correlation  coefficients,  and  conditional  cor¬ 
relation  coefficients,  and  give  an  efficient  and  reliable  way  of  obtaining  these  during  the  Cholesky 
factorization  of  the  given  covariance  matrix  E.  Readers  not  interested  in  such  statistical  objects 
need  only  note  that  the  aim  is  to  obtain  the  coefficients  p ^  in  (3.5),  and  this  is  done  from  the 

elements  trj-'J  of  E^  =  in  (1.4). 

The  angle  8 (a,b)  between  two  nonzero  vectors  a,  6  E  88"  is  defined  by 


cos  0(a,  6)  = 


\\a\m\2' 


0  <  8(a,  b)  <  x, 


and  §4  relates  this  to  some  orthogonal  transformations. 

It  is  possible  that  A  is  available  where  the  covariance  matrix  is  E  =  AT A,  and  §5  contains 
the  main  theorem  (Theorem  5.1)  which  relates  the  Cholesky  factorization  and  generalized  Schur 
complements  of  E  in  §2,  and  the  covariance  matrices  and  partial  correlation  coefficients  of  §3,  with 
the  angles  between  vectors  arising  in  the  QR  factorization  of  A  in  §5;  the  initial  work  on  this  was 
reported  in  [5j.  We  now  expand  on  this. 

We  know  that  it  is  not  generally  advisable  to  form  ATA  numerically  if  we  have  A  available, 
and  that  the  Cholesky  factor  U  of  E  =  AT A  =  UTU  can  more  reliably  be  obtained  as  the  upper 
triangular  matrix  U  in  the  QR  factorization  of  A  (see  for  example  [8],  Chapter  6) 


QQt  =  QtQ  =  I , 


A  =  QiU ,  Q  =  (Qi  Qt). 


The  question  now  arises  as  to  whether  the  Schur  complements  of  E,  or  their  factors,  and  the 
corresponding  partial  correlation  coefficients,  can  be  obtained  directly  from  the  QR  factorization 
of  A.  Section  5  answers  this  by  considering  how  algorithms  for  the  QR  factorization  of  A  suc¬ 
cessively  produce  the  orthogonal  projection  of  a,-  onto  the  space  orthogonal  to  ai,  R((  aj  a *)), 

...,  R{(ai  ...  a,_i )),  for  each  2  <  j  <  m.  The  scalar  (the  partial  correlation  of  and 
keeping  fixed),  t  <  j,k  <  m,  is  shown  to  be  the  cosine  of  the  angle  between  the 

projections  of  ay  and  a*  onto  i?((  a\  ...  a,_i  J)-1.  It  is  shown  how  such  correlations  can  be  found 
from  the  QR  factorization  of  A  without  forming  AT A  or  UTU.  The  Cholesky  factors  of  the  Schur 
complements  of  leading  principal  submatrices  of  E  =  AT  A  can  also  be  produced  directly. 

In  §6  we  consider  the  special  case  of  having  observations  of  random  variables,  and  estimating 
sample  partial  correlation  coefficients  from  these.  Thus,  when  z\,  xj  €  9fn  are  vectors  of  n  obser¬ 
vations  on  two  possibly  related  random  variables  £i,  and  £j,  and  their  means  are  subtracted  to 
give 

Oi  =  x i-c(x?e)/n,  *  =  1,2,  e  =  (l  ...  1  )T ,  (1.8) 


then  pa  =  cos0(ai,aj)  is  called  the  sample  correlation  coefficient  between  £i  and  £2.  For  n 
observations  on  each  of  m  such  random  variables  £1,  •  • . ,  (m,  the  means  can  be  subtracted  to  give 
A  =  ( ai  ...  am  ),  and  if 


E  = 


(0\\  •••  ^ltn 

^ml  •  •  •  &  mm 


=  ata, 


(1.9) 


the  sample  correlation  coefficients  are 


Pi]  =  -  xfpi/i ,  when  OaCjj  ^  0,  1  <i,j<m.  (1.10) 

aii  an 

The  sample  partial  correlation  coefficients  are  then  defined  in  terms  of  the  elements  of  the 
Schur  complements  of  principal  submatrices  of  E,  in  a  similar  manner  to  the  population  partial 
correlation  coefficients  mentioned  earlier.  Note  that  (n  -  1  )~1ATA  is  usually  taken  as  the  estimate 
for  the  population  covariance  matrix,  but  we  will  be  able  to  ignore  the  scaling  since  correlations 
are  independent  of  scaling.  By  an  intelligent  representation  of  the  data  we  show  that  the  computed 
results  may  be  updated  efficiently  for  certain  changes  in  the  data. 

In  §7  we  consider  the  costs  of  computations,  and  how  sets  of  partial  correlation  coefficients 
could  be  computed.  We  illustrate  the  loss  of  accuracy  caused  by  forming  E  =  AT A  in  finite 
precision  when  A  is  given,  so  that  working  with  A  could  be  important  in  some  cases  despite  the 
greater  efficiency  of  working  with  E. 

In  §8  we  try  to  summarize  the  various  results  of  the  paper  and  emphasize  the  relations  among 
them. 


2.  Cholesky  Factors  and  Generalized  Schur  Complements 

If  E(m  x  m)  is  symmetric  nonnegative  definite  it  has  a  Cholesky  factorization  with  upper 
triangular  factor  U : 

(on  ...  (Tim  \  /  «f  \  /  Mu  •  •  •  Mi  m  \ 

:  :  I  =  UTU,  U=  :  =  j  ,  >  0.  (2.1) 

^ml  •  •  -  ffmm  /  \  J  V  Mmm  J 

If  E  is  positive  definite  this  is  unique,  but  if  E  is  singular  it  need  not  be.  We  will  make  it  unique 

(see  for  example  [10],  p.  124)  by  demanding 

0  =  M.,,+  1  =  •••  =  Mim 


if  pa  =  0. 


(2.2) 


Clearly  the  first  row  u(  of  U  can  be  found  from  (2.1)  and  (2.2),  and  if  we  have  found  rows  uf,  . . 
uf_  j ,  we  can  find  the  ith  row  from  E^  in 


V  T  T  ( 0  0  \ 

E-ulUl  -  ...  -«i_1t£1=  s(t)J, 


E(*>  = 


a?!>  ... 

•t  «m 


*mi  •••  *mm 


Mitn  •  •  •  Mmm 


Mti  ■  •  •  M*«i 


The  particular  variant  of  the  Cholesky  algorithm  we  are  interested  in  then  has  the  form,  with 
=  eri},  (note  that  we  need  only  compute  the  upper  triangular  part), 

Cholesky  Algorithm 


i  =  1 , . . . ,  m  *,  :=  (^V* 
k  =  t  +  1 , . . . ,  m 


j  =  *  +  l,...,fc 


if  /!,,  =  0 

otherwise 

(i+i)  (.) 

a)k  ’  ■=  <r)k  ~  HjHik 


Here,  the  can  overwrite  the  ffj'J ,  and  the  can  overwrite  the  ffj'J .  Thus,  the  elements  of 

E  =  E^),  E<J\  ....  E^"*)  =  are  all  formed  by  the  algorithm.  This  is  essentially  the  basic 
structure  of  algorithm  SCHDC  in  LINPACK,  see  §8  and  Appendix  C.99-C.102  in  [6], 

If  we  partition  E  as  in  (1.3)  with  («'  -  1)  X  (»'  -  1)  En,  then  from  (2.3) 

E<’>  =  U?2U22,  (2.5) 

and  if  En  is  nonsingular,  (1.4)  shows  that  E^  is  the  Schur  complement  of  En  in  E.  If  En  is 
singular  a  generalized  Schur  complement  is,  see  (1.5),  (1.1)  and  (1.3), 

S  =  Ejj  —  EnE-En  =  Uf2U2i  +  Uj2Uu  -  Ul2Un(U^Un)~U^U12,  (2.6) 

where  E^  is  any  generalized  inverse  of  En  satisfying 

EnSnEu  =  En  =  UTUMU'UnruTUn.  (2.7) 

But  our  choice  (2.2)  ensures  in  (1.3)  that 

mUii  Vl2))  =  R(Un),  (2.8) 

so  U12  =  UiiBi2  for  some  Bu,  and  substituting  this  in  (2.6)  and  using  (2.7)  shows  that  (2.5)  is  the 
generalized  Schur  complement  of  singular  En  in  E. 

Now  from  (2.8) 

*((En  Xii))  =  Wn(Uu  Ul2 ))  =  R^Un)  =  *(E„), 

and  this  is  the  necessary  and  sufficient  condition  for  the  generalized  Schur  complement  of  En 
in  symmetric  E  to  be  independent  of  the  choice  of  generalized  inverse  EJ'j,  see  for  example  [2], 
Proposition  1.  We  can  now  summarize  this  result. 


Result  2.1.  For  any  m  x  m  symmetric  nonnegative  definite  matrix  £  with  elements  the 

Cholesky  algorithm  (2.4)  produces  £(')  in  (2.3),  i  =  2 as  the  unique  (generalized)  Schur 
complement  of  tie  leading  principal  i  —  1  square  submatrix  in  £.  The  algorithm  then  proceeds  to 
End  the  Cholesky  factor,  see  (2.3),  of  this  generalized  Schur  complement. 

This  is  a  simple  and  clean  theoretical  result,  involving  no  generalised  inverses,  just  a  variant 
of  the  Cholesky  factorization.  It  provides  a  strong  argument  for  using  this  Cholesky  factor  as  a 
powerful  theoretical  tool  in  this  area. 


3.  Covariance  Matrices  and  Correlation  Coefficients 

If  i  =  ( £i  •  •  ■  £m  )T  is  a  vector  of  random  variables  with  x  ~  (a,£),  which  indicates  x  has 


E\x\  =  a  =  ( ai  ...  am  )5 


and  covariance 


E[(x  -  a)(x  -  a)T]  =  £  = 


then  a  good  measure  of  the  degree  of  dependence  between  and  is  their  covariance  standardized 
by  their  standard  deviations.  This  is  called  their  correlation  coefficient 

.  -  gK&  -  <*)(&•  -  aJ-)l  _  °a 

'  {£:[(£••  -  a.)>]F[({,  -  «,)>!}>/!  1  ' 

Thus  all  the  correlation  coefficients  are  obtained  directly  from  the  covariance  matrix. 

If  x  and  £  are  partitioned  conformably 


-(:)■  >■(! 


£u  »  («  -  i)  x  («  -  i), 


and  £  is  nonsingular,  then  Anderson  [1],  p.  41  states  that  the  partial  covariance  of  X2  given  xi 
can  be  defined  as  the  covariance  of  the  residual  of  xj  from  its  regression  on  x\.  Anderson  [l]  §2.5 
shows  that  this  is  just  the  Schur  complement  of  £u  in  £,  which  we  have  shown  is  £(*)  in  (2.5),  the 
elements  of  which  are  the  erj]}  produced  in  (2.4).  The  partial  correlation  coefficients  are  defined 
in  terms  of  the  elements  of  this  partial  covariance  matrix,  just  as  was  done  in  (3.3)  for  ordinary 
correlation  coefficients. 

Definition  3.1.  The  partial  correlation  between  and  £*,  j,k  =  «, . . . ,  m,  given  &,  . . .,  £,_i  is 
defined  to  be 

-(«) 

p\')  =  — pr - -*  ... - ,  <7!^°rk*  #  0,  (3.5) 

and  is  a  measure  of  dependence  between  and  when  the  effects  of  £i,  . . .,  &_i  have  been 
removed. 

The  work  here  has  shown  that  a  numerically  reliable  and  efficient  way  of  obtaining  these  is  to 
use  (3.5)  once  the  crj'^  have  been  obtained  directly  from  the  Cholesky  factorization  (2.4).  If  z  also 
has  a  multivariate  normal  distribution,  so  that  x  ~  (a,E)  completely  determines  the  distribution, 
then  Anderson  [1]  §2.5  shows  that  £(')  is  the  conditional  covariance  of  xj  given  X\,  and  then  we 
can  call  (3.5)  a  conditional  correlation. 


The  nonzero  condition  in  (3.5)  automatically  holds  when  E  is  nonsingular,  but  we  have  included 
it  for  full  generality.  If  E  is  singular  and  =  0,  then  it  is  straightforward  to  show  a =  0, 
k  —  i, . . . ,  m,  and  is  undefined. 

Although  this  is  not  a  statistical  paper,  we  will  now  show  that  the  generalized  Schur  comple¬ 
ment  E(')  is  the  partial  covariance  matrix  of  x2  given  x\,  when  En  is  singular.  We  do  this  to  show 
the  full  generality  of  the  results  here,  and  to  illustrate  how  easy  it  is  to  prove  such  results  just  using 
the  Cholesky  factorization  (2.4)  which  provides  and  its  factors  so  elegantly. 

If  x  ~  (a,E)  and  the  Cholesky  factor  U  of  E  =  UTU  is  as  in  (2.4),  then  we  may  remove  the 
zero  rows  of  U  to  obtain  U  of  full  row  rank  and  such  that  E  =  UTU .  It  follows  from  Anderson  [1], 
pp.  32-33  that  with  probability  1 

*= (*0 = (°0+(4  4)  {ii)=a+°Tv’ 

where  U21  and  U22  each  have  full  column  rank.  If  x\  is  given  (such  xj  must  lie  in  the  linear  variety 
oi  +  V\\V\,  for  some  uj,  to  be  consistent)  then  is  completely  determined,  and 

x2  =  b2  +  Of2v2,  bi  =  a2  -I-  &f2vi  given,  v2  ~  (0,  /). 

Now  using  the  superscript  +  to  denote  the  Moore-Penrose  generalized  inverse, 

b2  =  a2  +  U?2{U?1)+(x1-a1) 

can  be  defined  as  the  regression  of  x2  on  x2  ([1],  p.41),  so  the  residual  Uf2v2  ~  (O.E^),  and  the 
partial  covariance  matrix  is  E^. 

It  follows  that  (3.5)  can  also  be  used  to  define  the  partial  correlation  coefficients  when  Eu 
is  singular,  and  so  the  Cholesky  factorization  (2.4)  produces  the  partial  covariance  matrices  E^ 
and  its  factors,  see  (2.3),  and  then  the  partial  correlation  coefficients  are  given  by  (3.5),  for  any 
symmetric  nonnegative  definite  covariance  matrix  E  =  (<rj^). 

4.  Angles  Between  Vectors,  and  Orthogonal  Transformations 

We  now  consider  some  results  which  at  first  glance  seem  unrelated  to  §§2  and  3.  We  exhibit 
the  connection  at  the  end  of  §5. 

We  have  defined  the  angle  0(a,6)  between  two  nonzero  vectors  a,  6  6  Sn  to  satisfy 

aTb 

co«e(a,t)  =  ||a|||||t||>.  0<  6(0,6)  <,.  (4.1) 

If  either  a  or  6  is  zero  the  angle  is  not  defined.  We  will  be  interested  in  orthogonal  transformations, 
and  the  following  obvious  result  indicates  how  angles  are  preserved. 

Lemma  4.1.  For  any  matrix  Qi(n  x  r)  such  that  Q2Qi  =  /,  and  any  nonzero  vectors  c,d  e  9tr 

9(Qic,Qid)  =  e(c,d).  (4.2) 

The  next  result  exhibits  a  more  interesting  connection  between  orthogonal  transformations 
and  angles,  and  provides  an  alternative  to  (4.1)  for  obtaining  angles. 
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Theorem  4.1.  Let  a,J6  S"  be  nonzero  vectors,  And  (a  b)  =  QiU  a  QR  factorization  with 


QTia  !)-("),  *-(«  £).  «*><».  «i°- 


And  <?  =  (<?!  Q2  )  an  n  x  n  orthogonal  matrix.  Consider  the  orthogonal  rotation  through  <t>  which 
transforms  U  to  lower  triangle,  with 


cos  4>  —  sin 


sin  4>  cos 


•S)  (m»)  ~  (*»)'  **2-(ri2  +  Mit)t/3>  0. 


Then  the  angle  between  a  and  b  may  be  obtained  from  either 


cos  0(a,  b)  =  sin  =  pi2/A22, 
sin  0(a,  b)  =  cos  <f>  =  p22/A22. 


The  proper  of  the  two  solutions  for  (4.6)  is  retrieved  by  having  the  angle  ©(a,  b)  satisfy  sign(f  - 
0(a,6))  =  sign(/i12). 


Proof.  From  Lemma  4.1,  (4.1),  and  (4.4) 


cos0(a,6)  = 


A*llA*12 


Pll(Pl3  +  Pti)l/2  *22 


A*12  .  , 

=  —  =  sin  <f>, 


the  last  step  coming  from  multiplying  (4.4)  by  the  transpose  of  the  rotation,  which  also  gives  p22  = 
*22  cos  showing  that  cos  ^  >  0.  But  from  (4.5)  sin0(a,6)  =  rfccos^,  whereas  the  definition  (4.1) 
ensures  sin0(a,6)  >  0  and  so  (4.6)  follows. 


To  illustrate  that  this  is  a  result  of  general  computational  value,  consider  Boating  point  com¬ 
putations  with  precision  6,  so  that  if  |7|  <  6  the  computed  result  of  adding  7  to  1  is  fl(l  +  7)  =  1. 
With 


(a  A )  =  ^  *  *  J  ,  tj  a  computer  number,  rj2  <  S 


computing  (4.1)  or  (4.5)  would  give 


fi(cos0(a,  A))  =  1, 


leading  us  to  believe  0(a,A)  =  0.  The  alternative  (4.6)  would  give 


fl(sin0(a,  A))  =  7, 


telling  us  0(a,A)  es  rj,  which  is  far  more  desirable. 

Note,  that  by  following  the  work  of  Wilkinson  [13]  the  computations  for  cos  0(a,  b )  in  (4.1)  and 
(4.5),  and  for  sin0(a,A)  in  (4.6),  can  be  shown  to  be  numerically  stable,  so  if  we  want  the  cosine 
or  sine  then  these  give  all  the  accuracy  we  can  expect.  However  |coe0|  «  1  is  relatively  insensitive 
to  small  changes  in  0,  and  if  we  want  ©  in  such  cases  then  we  should  avoid  (4.1)  and  (4.5).  On  the 
other  hand  ( sin  0|  fs  1  is  also  relatively  insensitive  to  small  changes  in  8 ,  and  (4.6)  should  not  be 
used  to  compute  0  in  such  cases.  Replacing  b  by  ( 7  1  )T  in  the  above  example  will  illustrate  this. 

In  summary,  if  we  want  to  compute  0(a,A)  we  could  use  (4.1)  or  (4.5)  when  |cos0(a,A)|  < 


and  (4.6)  otherwise.  If  the  choice  is  between  (4.5)  and  (4.6)  only,  this  criterion  would  lead  to  (4.5) 
when  j/^i2 1  <  a*jj  and  (4.6)  otherwise. 

Note  that  computing  (4.1)  takes  about  3n  flops  (for  a  definition  of  ‘flop’  see  [8],  p.  32),  while 
computing  (4.3)  takes  about  4n  flops  at  best,  [8],  p.  148. 

In  the  remaining  sections  we  will  only  discuss  the  computation  of  cosines,  and  so  either  (4.1)  or 
(4.5)  will  suffice.  However,  we  will  present  theory  related  to  angles  in  general,  so  anyone  intending 
to  compute  these  actual  angles  can  use  the  results  of  this  section. 

5.  The  QR  Factorization  and  Angles  Between  Projected  Vectors 

Suppose  A(n  x  m)  =  (a i  ...  am),  which  need  not  have  rank  m.  Here  we  will  partition 
A  =  (A  i  Aj  )  with  Ai(n  x  t  -  1),  and  describe  how  the  QR  factorization  produces  projections  of 
columns  of  Aj  onto  the  space  i?(Ai)_L.  We  will  later  show  that  partial  correlation  coefficients  are 
the  cosines  of  angles  between  such  projections  when  E  =  AT  A. 

Suppose  Q(n  x  n)  =  (Qi  )  is  an  orthogonal  matrix  such  that 

A  =  (AX  A,)  =  (Q1  Q»)(o  £),  (5.1) 

where  E[q  x  i  —  1)  is  of  rank  q.  This  could  be  computed  by  the  first  q  steps  of  the  QR  factorization, 
or  the  first  q  steps  of  the  modified  Gram-Schmidt  (MGS)  algorithm  (see  [8],  Chapter  6),  but  we 
only  require  the  zero  block  and  the  full  row  rank  of  E,  giving 

Ai  =  QiE,  A2  —  Q1F  +  Q2G. 

Let  us  define  the  two  symmetric  idempotent  matrices 

Pi  =  QiQl,  P2  =  Q*Q\. 

Since  E  has  full  row  rank,  f?(Ai)  =  R{Qi)  =  R{Pi)  so  Pi  is  the  orthogonal  projector  onto  if(Ai), 
P2  =  /  —  Pi  is  the  orthogonal  projector  onto  i?(Ai)x,  and 

P2A2  =  Q2G  =  A2-  QiF  (5.2) 

is  the  projection  of  the  columns  of  A2  onto  R(Ai)x.  That  is,  the  columns  of  QjG  correspond  to 
the  columns  of  Aj  after  the  dependence  on  oj,  . . .,  a,_i  has  been  removed,  and  it  is  the  correlation 
between  these  that  is  of  interest  in  defining  partial  correlations.  We  first  consider  the  angles  between 
these  vectors. 

Definition  5.1.  Let  h,  be  the  projection  of  ay  onto  the  space  orthogonal  to  aj,  . . .,  a,_i,  where 
A  =  (aj  ...  am) .  We  define  0y* (A)  to  be  the  angle  between  any  pair  of  nonzero  such  h,  and 

k*  1  1,  k  =  I ,  .  .  .  ,  TO. 

If  we  write  G  =  (y,-  ...  gm )  we  have  hj  =  Qjpy,  and  from  Lemma  4.1 

e!*(A)  =  &(Q29j,Q29k)  =  9(9], 9k)  if  11^112115*112^0.  (5.3) 

The  angle  is  undefined  if  either  projection  is  zero.  It  follows  that  these  angles  are  just  the  angles 
between  columns  of  G  in  (5.1). 

Lemma  4.1  in  §4  described  an  invariance  of  angles  under  certain  transformations,  and  the 
following  result  extends  this. 


Lemma  5.1.  For  any  matrix  Qi(p  x  n)  such  that  Q\Qi  =  /,  if  B  =  (6j  ...  bm)  =  Q\A  then 

0^(B)  is  defined  if  and  only  if  Q^(A)  is  defined,  and  in  this  case 

eg  (B)  =  e«jI(QtA)  =  e}2(A),  y,  *  =  •,... ,  m. 

Proof. 

B  =  Q,A  =  e,<j(*  £) 

has  the  same  (7  as  the  orthogonal  factorization  in  (5.1)  so  the  result  follows  from  (5.3). 


Since  (5.3)  only  depends  on  the  full  row  rank  of  E  and  the  zero  block  in  (5.1),  it  is  unaffected 
by  orthogonal  transformations  applied  to  the  left  of  (E  F ),  or  of  G.  In  particular,  orthogonal 
transformations  of  the  form  (4.3)  and  (4.4)  could  be  used  on  G  to  find  the  0^ (A). 

Note  that  q  steps  of  the  QR  factorization  for  computing  (5.2)  will  provide  Qi  and  G  separately, 
whereas  MGS  only  provides  QiG  when  it  is  stopped  after  q  steps.  From  (5.3)  we  see  either  will 
do  for  computing  angles.  In  [8],  Chapter  6  it  is  shown  that  for  a  complete  QR  factorization  the 
Householder  and  fast  Givens  algorithms  each  require  about  m2(n  —  m/3)  flops,  while  MGS  requires 
about  m2n.  Often  n  »  m  and  there  will  be  little  difference  in  cost,  but  we  would  generally  choose 
the  QR  factorization  for  its  excellent  numerical  properties. 

We  now  give  the  result  which  relates  the  work  in  §§2  and  3  with  that  in  §§4  and  5  so  far,  see 
[5],  and  motivated  the  lengthy  title  for  this  paper. 

Theorem  5.1.  If 

AtA  =  £  =  (*<■>),  (5.4) 

then  for  i  =  1, . . . ,  m  and  j,  k  =  i, . . 


,  m 


=  cos  Byi(A), 


when  either  of  these  is  defined.  Here  the  pQ  are  defined  by  (3.5),  where  the  oj'J  are  elements  of  the 
Schur  complement  E^  of  the  leading  principal  («  -  1)  x  (i  -  1)  submatrix  of  E,  and  arise  naturally 
in  the  Cholesky  factorization  (2.4)  of  E.  The  ©y*  {A)  are  as  in  Definition  5.1  above. 

When  E  is  a  scalar  multiple  of  the  (population  or  sample)  covariance  matrix  of  some  distribu¬ 
tion,  these  are  the  corresponding  partial  correlation  coefficients,  see  Definition  3.1.  As  a  result 
partial  correlation  coefficients  can  also  be  defined  in  terms  of  angles  between  projected  vectors. 
These  projected  vectors  arise  naturally  from  the  QR  factorization  of  A. 

Proof.  The  p'jl  in  (3.5)  are  a  function  of  the  elements  oj’j  of  2^  =  U^Uij  obtained  from  the 
Cholesky  factorization  (2.4), 


E  =  UTU 


-  (UUU“ 


u[,u13 


u*  Uu_  j  ( 


U[tu» 


UiiUu  is  (»  -  1  x  t  -  1), 


(5.5) 


where  these  products  are  unaffected  if  all  the  zero  rows  are  removed  from  U,  so  we  can  henceforth 
assume  Uu  has  full  row  rank. 

In  (5.3)  we  showed  that  0 (A)  is  the  angle  between  columns  g y  and  p*  of  G,  where  from  (5.4) 
and  (5.1) 

kTa_(EtE  EtF 
~\FtE  FtF  +  GtG 


where  E(q  x  t  -  1)  is  of  rank  q,  so  that  F  =  EC,  C  =  ET(EET)~1F. 

We  now  equate  (5.5)  and  (5.6)  to  show  GTG  =  U22U2 i-  The  (1,1)  and  (1,2)  blocks  give 

Un  =  {UnU[1)-1UuETE 

U12  =  (UnuTr'UnETF  =  UUC 

so  that 

U?2U12  =  CTUjlUnC  =  ctetec  =  ftf, 

and  the  desired  result  follows  from  equating  the  (2,2)  blocks.  This  shows  gjgt  =  er^,  and  combining 
(5.3),  the  definition  of  angle  (4.1),  and  (3.5),  proves  the  theorem. 

I 

This  result  is  essentially  based  on  the  uniqueness  relation  between  the  Cholesky  factor  of  AT  A 
and  the  upper  triangular  matrix  in  the  QR  factorization  of  A.  For  full  column  rank  A  the  proof 
would  be  briefer,  but  we  have  given  the  result  in  its  full  generality. 

To  summarize  one  practiced  advantage  of  this  theorem,  if  we  are  given  A  where  we  know 
E  =  AT A  is  the  sample  or  population  cova’-iance  matrix  of  a  distribution,  then  we  can  compute  the 
corresponding  partial  correlation  coefficients  pQ  directly  from  A,  first  by  carrying  out  a  QR-like 
factorization  of  the  form  (5.1),  then  finding  cos0(<;3,0*)  using  either  (4.1)  or  (4.3)  with  (4.5). 

6.  Observations  of  Random  Variables 

Suppose  we  have  m  random  variables  £i ,  . . .,  £m,  and  for  j  =  1, . . . ,  m,  xy  =  (  )T 

is  a  vector  of  n  observations  of  .  Then  for  j  =  1 , . . . ,  m 

ai  =  xi  -  £«(*>  «).  e  =  ( 1  ...  1  )r ,  (6.1) 

is  Zj  adjusted  for  its  mean,  and 


is  the  usual  estimate  for  the  covariance  matrix  of  the  distribution  of  these  random  variables,  and 
is  called  the  sample  covariance  matrix.  From  this  we  could  compute  the  sample  partial  covariance 
matrices  and  sample  partial  correlation  coefficients  exactly  as  in  §3,  and  these  would  be  estimates  of 
the  corresponding  population  values.  However  it  is  numerically  preferable  to  work  with  A  directly, 
and  so  these  could  also  be  computed  as  described  in  §§4  and  5. 

A  difficulty  with  this  approach  is  that  if  we  obtain  (  £n+i,i  . . .  £n+i,m ),  a  new  observation 
for  each  variable,  and  wish  to  update  our  values,  then  every  mean  has  to  be  adjusted  and  every 
element  of  A  is  changed,  as  well  as  having  a  new  row  added.  To  avoid  this,  note  that  (6.1)  can  be 
written 

1  X 

a.  =  Pi, ,  P  —  I - ee1 , 

n 

where  P  is  symmetric  and  idempotent,  and  is  the  orthogonal  projector  onto  the  space  orthogonal 
to  e.  But  from  (5.1)  and  (5.2),  one  step  of  the  QR  factorization 

(e  X)  =  (q  i  <?2)(^o  ,  X=(xi  ...  zm ) , 

gives  a  matrix  Q2B  whose  columns  are  the  projections  of  the  columns  of  X  onto  the  space  orthogonal 
to  e,  and  so  are  the  required  o3.  Now  since  A  =  PX  =  Q2Q2X  =  QiB,  and  all  the  values  we 


■A  A,  A  =  (oi 


want  can  be  found  from  AJ  A  =  BT B,  we  can  work  with  B  instead  of  A,  and  continuing  the  QR 
factorization  of  (e  X )  will  produce  everything  needed  in  §5. 

This  is  not  just  a  nice  way  of  effectively  adjusting  the  ij  for  their  means.  If  we  already  have 
the  QR  factorization  of  (  e  X ),  we  can  add  the  new  row  ( 1  £n+11  ...  £n+i,m)t°(e  X  )  and 

update  the  factorization  using  standard  efficient  and  numerically  reliable  techniques,  see  for  example 
[8]  §12.6.  We  can  also  update  the  factorization  efficiently  when  we  discard  a  row  of  ( e  X),  while 
adding  or  deleting  columns  (corresponding  to  random  variables)  is  particularly  easy.  We  could  also 
update  the  Cholesky  factor  of  given  £  when  a  row  and  corresponding  column  is  added  or  deleted, 
see  also  [7,  10].  It  is  a  straightforward  exercise  to  compute  the  new  partial  correlation  coefficients 
following  such  computations,  however  there  is  a  danger  of  numerical  errors  in  the  case  of  discarding 
a  row. 


7.  Some  Computational  Considerations 

So  far  we  have  considered  computing  p^J,  the  partial  correlation  coefficient  of  (}  and 
keeping  £i ,  . . .,  £,_i  fixed  where  j,k  =  m;  and  we  have  seen  how  to  compute  these  for  the 

fixed  ordering  i  =  1, . . .  ,m.  Of  course  any  initial  ordering  of  the  columns  of  X  in  §6  or  A  in  §5  or 
any  symmetric  ordering  of  columns  and  rows  of  £  could  be  chosen,  and  during  the  factorizations 
pivoting  can  always  be  used  on  the  unfactored  part  to  determine  which  coefficients  will  be  produced 
next. 

Let  us  write  pf-  where  S  is  some  set  of  indices,  to  denote  the  partial  correlation  coefficient  of 
and  keeping  £*  fixed  for  all  Jfc  €  S.  When  several  lots  of  such  coefficients  are  required  it  is 
usually  the  case  that  we  want  pf*,  k  =  1,2, ...  ,t,  where  Si  C  5j  C  C  S(.  In  this  case  we  can 
order  columns  of  A  for  example  to  give 

(Aj  At  ...  At  At+i)  =  A 


where  ( Ai  ...  Ak)  has  just  those  columns  with  indices  in  Sk,  k  —  l,...,f  and  At+J  contains 
the  remaining  columns.  Applying  the  QR  factorization  to  A  will  allow  us  to  compute  everything 
in  the  correct  order. 

If  we  already  have  upper  triangular  U  in  A  =  Q\U  (or  £  =  UTU),  and  we  wish  to  compute 
values  for  a  different  ordering,  we  can  always  reorder  columns  (and  rows)  and  update  U  to  produce 
the  required  form.  For  example  suppose  U  =  ( tij  ...  )  is  the  nonsingular  upper  triangular  5x5 

matrix  obtained  from  the  QR  factorization  of  A,  and  we  now  want  to  compute  pf  6  with  S  =  {2,4}. 
We  could  arrange  the  columns  and  transform  ( U2  U4  ui  U5 )  by  eliminating  elements  (2,1), 
(4,2)  and  (3,2): 


/ 


V 


* 

* 


* 

* 

* 

* 


F 

G 


) 


and  pf  B  is  the  cosine  of  the  angle  between  the  two  columns  of  G.  This  could  be  found  using  (4.1), 
or  by  applying  rotations  as  in  (4.3)  and  (4.4),  see  (4.5). 

We  would  tend  to  use  the  above  approach  to  maintain  as  much  numerical  information  as 
possible  when  we  initially  know  A  or  X,  but  when  we  are  given  £,  it  could  be  worth  considering 
more  efficient  ways  of  updating  U,  see  [7],  Note  that  after  i  —  1  steps  of  the  Cholesky  algorithm 
(2.4)  all  the  <rj.^  are  at  hand,  and  so  all  the  p^  in  (3.5)  can  be  computed  in  m-i  square  roots  and 
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(m  —  »)(m  — «  + 1)/2  multiplications  and  divisions.  On  the  other  hand  if  we  use  the  QR  factorization, 
or  already  have 

we  still  need  to  find  the  cosines  of  the  angles  between  the  columns  of  G.  IfGisn-i  +  lxm-i'+l 
and  full  this  would  take  (n  —  »  +  l)(m  —  t  +  l)(m  -  i)/2  flops  to  compute  all  the  gjgk  in  (4.1).  The 
cost  would  be  significantly  less  if  n  »  m  and  G  was  already  upper  triangular,  but  it  would  still 
be  on  the  order  of  (m  -  t)s,  and  so  this  QR  approach  is  inherently  more  costly  than  the  Cholesky 
approach  if  many  partial  correlation  coefficients  are  required.  However  there  are  situations  where 
the  loss  of  accuracy  in  forming  AT  A  can  be  significant,  as  we  now  show,  see  also  [5], 

Example*  Suppose  we  are  given 

/-1  1  0  \ 

±  1  -1  —2e 

Vi  <  <  >  +  <  ’ 

V  —e  -e  -1  +  eJ 


where  e  is  non-zero,  and  that  the  partial  correlation 


J*)  - 

P  23  — 


is  to  be  determined.  The  corresponding  covariance  matrix  AT A  is 


/  l  +  e2 

s- h+- 


-l  +  e2  0 

1  +  e2  2e 

2t  1  +  3c2 


and  in  exact  arithmetic,  one  has 


r(2)  _  2f  <r(2)  -  I  +  A  _  +  ff(2)  =  i  + 

r23  ~  ze>  ° 22  1  +  C  l  +  e2  ’  "3S  1  ^ 


so  that 


P'S  =8i«n(c)YY^j- 


However,  in  finite  precision  floating  point  arithmetic  and  with  e  chosen  to  be  sufficiently  small 
(2e  <  61/2,  where  S  is  the  floating  point  precision  so  that  the  computed  fl(l  ±  4e2)  =  1),  the 
computed  quantities  turn  out  to  be 

(  1  -1  °\ 

fl(E)  =1-1  1  2e  I 

I  0  2e  1  / 


=  0,  <7^  =  1 


so  that  pS  is  not  a  finite  number. 


Performing  a  QR  decomposition  of  the  matrix  A  yields  a  4  x  3  matrix  Q  with  orthonormal 
columns  and  a  3  x  3  upper  triangular  factor 


U  = 


vi  + 


1  +  e2  -1  +  e2  0 

0  2je|  sign(e)(l  +  e2) 

0  0  |e|v/2(l  +  £*) 


in  exact  arithmetic.  In  finite  precison  arithmetic  with  the  same  choice  of  e  as  before,  one  has 

/  1  — 1  0 
fl(tf)  =  [  0  2|e|  sign(e) 

VO  0  y/2\e\ 

and  the  sine  of  the  rotation  that  transforms  the  submatrix 

/  2|f|  sign(f)  \ 

V  0  y/2\e\  ) 

to  lower  triangular  form  is  equal  to  =  Bign(e),  the  exact  value  to  machine  precision. 


8.  Conclusions 


In  this  paper  we  have  shown  several  relationships  between  theoretical  objects  and  algorithms, 
and  used  these  to  suggest  approaches  to  numerical  computations.  Several  different  topics  have  been 
considered,  and  it  is  important  to  see  how  they  all  fit  together. 

It  is  a  standard  result  that  the  Cholesky  factors  of  the  Schur  complements  E^'l  of  all  leading 
principal  submatrices  of  symmetric  positive  definite  E  are  produced  by  the  Cholesky  factorization 
of  E.  It  is  probably  less  well  known  (at  least  we  know  of  no  reference)  that  not  only  the  Cholesky 
factors,  but  also  the  Schur  complements  themselves  are  produced  directly  (that,  without  having 
to  form  the  products  of  their  factors)  with  the  correct  organization  of  the  Cholesky  algorithms, 
see  (2.4).  We  have  extended  this  result  to  show  it  also  holdB  for  singular  E  when  we  consider  the 
generalized  Schur  complements  E^'V  This  is  summarized  in  Result  2.1. 

When  E  =  (<r,y)  above  is  the  nonsingular  covariance  matrix  of  a  vector  of  random  variables 
x  ~  (a,  E),  then  =  *«, /(*«<*»  )1/a  are  the  correlation  coefficients.  The  Schur  complement 
£(•)  =  (<yj.'J )  ,  j,  k  =  »,...  ,m  of  the  leading  (t  -  1)  x  (i  —  1)  submatrix  of  E  is  called  the  partial 
covariance  of  elements  i  tc  "  of  x,  given  the  first  *  —  1  elements.  The  corresponding  partial 
correlation  coefficients  are  then  —  <r^/(<r  I'/ a**)1/2-  Since  the  are  given  directly  by  the 
Cholesky  algorithm  (2.4),  the  pQ  can  be  computed  from  them.  We  have  shown  that  even  when 


E  is  singular,  the  generalized  Schur  complement  E^  given  by  the  Cholesky  algorithm  (2.4)  is  still 
the  required  partial  covariance  matrix,  and  so  the  partial  correlation  coefficients  can  be  computed 
as  indicated  above  whenever  nonzero.  If  x  has  a  multivariate  normal  distribution  these 

partial  covariances  and  correlations  are  also  conditional  covariances  and  correlations. 

If  A  is  available  where  E  =  ATA,  then  the  upper  triangular  Cholesky  factor  U  of  E  is  given  by 
the  QR  factorization  of  A.  It  was  shown  in  [5]  how  partial  correlation  coefficients  could  be  computed 
from  U  without  forming  UTU.  In  §5  we  rephrased  those  results  to  emphasize  that  partial  correlation 
coefficients  are  the  cosines  of  angles  between  vectors  appearing  in  the  QR  factorization  of  A.  As 
a  result,  partial  correlation  coefficients  for  different  sets  of  given  elements  of  the  random  vector  x 
can  be  obtained  by  permuting  the  columns  of  A  and  updating  the  QR  factorization,  see  [5],  Again 
this  did  not  require  E  to  be  nonsingular. 


13 


The  two  main  sequences  of  relationships  have  therefore  been,  for  a  (possibly  singular)  covari¬ 
ance  matrix  E  =  AT  A: 

1.  Cholesky  factorization  of  E,  as  implemented  in  (2.4) 

•  (generalized)  Schur  complements  of  leading  submatrices  of  E 

•  partial  covariance  matrices 

•  partial  correlation  coefficients. 

2.  QR  (or  MGS)  factorization  of  A 

•  projections  of  later  columns  of  A  onto  spaces  orthogonal  to  earlier  columns 

•  cosines  of  angles  between  these  projections 

•  partial  correlation  coefficients. 

Of  course  both  the  Cholesky  factorization  of  E  and  the  QR  factorization  of  A  produce  Cholesky 
factors  of  each  leading  principal  submatrix  of  E  and  of  the  corresponding  (generalized)  Schur 
complement,  but  these  are  well  known  and  not  part  of  the  above  sequences. 

When  the  population  covariance  matrix  for  the  random  vector  z  is  not  available,  and  A  is 
obtained  from  n  observations  of  each  of  the  random  variables  as  shown  in  §6,  then  E  =  (n— 1)~ 1 AT  A 
is  the  sample  covariance  matrix.  The  sample  partial  covariance  matrices  are  the  generalized  Schur 
complements  of  the  principal  submatrices  of  E,  and  the  sample  partial  correlation  coefficients  can 
be  found  from  the  Cholesky  factorization  of  E  or  the  QR  factorization  of  A,  just  as  was  described 
for  the  population  partial  correlation  coefficients.  In  this  case  however,  we  may  obtain  further 
observations  on  each  random  variable,  and  §6  indicated  how  the  computations  may  be  arranged 
to  include  these,  and  update  the  factorization  and  corresponding  partial  correlation  coefficients. 
Ways  of  computing  different  sets  of  partial  correlation  efficiently  were  considered  in  §7. 

Reliable  numerical  computation  has  been  a  guide  in  our  search  for  connections  and  algorithms 
here,  and  in  §4  we  made  some  general  observations  on  computing  angles  between  vectors,  and  the 
possible  effect  of  finite  precision.  More  importantly  it  was  pointed  out  in  [5]  and  repeated  in  §7 
here,  how  working  with  the  QR  factorization  of  A  rather  than  the  Cholesky  factorization  of  AT A 
can  be  numerically  crucial  in  some  cases  of  computing  partial  correlation  coefficients,  even  though 
the  At A  approach  may  be  faster. 

In  a  forthcoming  paper  will  be  presented  a  derivation  of  this  QR  approach  from  the  parallel 
implementation  of  an  algorithm  for  solving  symmetric  positive  definite  linear  systems  by  hyperbolic 
rotations,  as  shown  in  [4] . 
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